Floating-Point Typedefs Having Specified Widths - N3626
Authors
Abstract
It is proposed to add to the C++ standard optional floating-point typedefs having specified widths. The optional typedefs include float16_t, float32_t, float64_t, float128_t, their corresponding least and fast types, and the corresponding maximum-width type. These are to conform with the corresponding specifications of binary16, binary32, binary64, and binary128 in the IEEE 754 floating-point format. The optional floating-point typedefs having specified widths are to be contained in a new standard library header <cstdfloat>. They will be defined in the std namespace. New C-style macros to facilitate initialization of the optional floating-point typedefs having specified widths from floating-point literal constants are proposed. It is not proposed to make any mandatory changes to <cmath>, special functions, <limits>, or <complex>.
The main objectives of this proposal are to:
• Extend the benefits of specified-width typedefs for integer types to floating-point types.
• Improve floating-point safety and reliability by providing standardized typedefs that behave identically on all platforms.
• Optionally extend the range of floating-point to lower and to higher precision.
• Provide a Standard way of specifying 128-bit precision.
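As a rough illustration of the idea in the abstract above, the following C++ sketch shows how an implementation might expose specified-width floating-point typedefs only when the built-in types match the IEEE 754 binary32 and binary64 layouts. It is a sketch under stated assumptions, not the proposal's wording: the `sketch` namespace, the `SKETCH_HAS_*` feature macros, the `SKETCH_FLOAT32_C` and `SKETCH_FLOAT64_C` literal macros, and the `self_check` helper are all hypothetical names chosen for this example. Only `float32_t` and `float64_t` are taken from the abstract, and they are placed in `sketch` rather than `std`.

```cpp
#include <cassert>
#include <cfloat>
#include <climits>

// Hypothetical sketch: the typedefs are optional, so each one is defined
// only when a built-in type has the matching binary format.
// Note: macros are not scoped by namespaces; the SKETCH_ prefix avoids clashes.
namespace sketch {

#if FLT_RADIX == 2 && FLT_MANT_DIG == 24 && FLT_MAX_EXP == 128
#define SKETCH_HAS_FLOAT32 1
typedef float float32_t;              // 24-bit significand, as in binary32
#define SKETCH_FLOAT32_C(x) x##F      // assumed literal macro: 1.5 -> 1.5F
#endif

#if FLT_RADIX == 2 && DBL_MANT_DIG == 53 && DBL_MAX_EXP == 1024
#define SKETCH_HAS_FLOAT64 1
typedef double float64_t;             // 53-bit significand, as in binary64
#define SKETCH_FLOAT64_C(x) x         // an unsuffixed literal is already double
#endif

// Checks the storage width of whichever typedefs are available and
// exercises the literal macros with exactly representable values.
inline bool self_check() {
#if defined(SKETCH_HAS_FLOAT32)
    static_assert(sizeof(float32_t) * CHAR_BIT == 32, "float32_t must be 32 bits wide");
    const float32_t a = SKETCH_FLOAT32_C(0.5);
    assert(a + a == SKETCH_FLOAT32_C(1.0));
#endif
#if defined(SKETCH_HAS_FLOAT64)
    static_assert(sizeof(float64_t) * CHAR_BIT == 64, "float64_t must be 64 bits wide");
    const float64_t b = SKETCH_FLOAT64_C(0.25);
    assert(b + b == SKETCH_FLOAT64_C(0.5));
#endif
    return true;
}

} // namespace sketch
```

Code that needs a given width can then test the corresponding feature macro before using the typedef, in the same spirit as checking for an optional exact-width integer type. The `float16_t` and `float128_t` cases are omitted here because standard C++ has no portable built-in type to map them to.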
Similar resources
Floating-Point Typedefs Having Specified Widths - N1703
It is proposed to add to the C++ standard optional floating-point typedefs having specified widths. The optional typedefs include float16_t, float32_t, float64_t, float128_t, their corresponding least and fast types, and the corresponding maximum-width type. These are to conform with the corresponding specifications of binary16, binary32, binary64, and binary128 in the IEEE 754 floating-point format. T...
A Floating-Point Unit for Arithmetic Operations
In this paper we present a design for a floating point unit partially compliant with the IEEE 754 floating point standard. The unit fully implements comparisons and partially implements floating-point addition and multiplication. It is fully parametrized and may be used with floating point numbers whose composite fields have widths of any desired length.
Checking Compatibility of Bit Sizes in Floating Point Comparison Operations
We motivate, define and design a simple static analysis to check that comparisons of floating point values use compatible bit widths and thus compatible precision ranges. Precision mismatches arise due to the difference in bit widths of processor internal floating point registers (typically 80 or 64 bits) and their corresponding widths when stored in memory (64 or 32 bits). The analysis guarant...
Precision Modeling and Bit-width Optimization of Floating-Point Applications
We present a floating-point precision modeling methodology that can be used to develop application adaptive arithmetic precision models for variable bitwidth floating-point computing. We also developed optimization algorithms that minimize the total bit-width for the application such that the output accuracy meets user-defined requirements. The methodology supports different bit-widths for diff...
The New IEEE-754 Standard for Floating Point Arithmetic
The IEEE-754 standard for Floating Point Arithmetic[1] that was in effect at the time of this seminar was adopted in 1985. That standard was intended for hardware implementation, although provisions were made for software implementation for operations. In addition to required operations, an appendix of recommended functions was also specified. Default exception handling was specified, and provi...